This 32-item bibliography was compiled to provide
access to research and discussions of domain referenced testing. It
is not limited to any educational level, nor is it confined to any
specific curriculum area. Five data bases were searched by computer.
A computer search of the Educational Resources Information Center
(ERIC) data base yielded documents announced in Resources in
Education and journal articles indexed in Current Index to Journals
in Education. Also searched by computer were Psychological Abstracts,
Exceptional Child Education Abstracts, Sociological Abstracts, and
Comprehensive Dissertation Abstracts. A subject index is provided.



The Educational Resources Information Center (ERIC) is operated
by the National Institute of Education of the United States Department
of Health, Education, and Welfare. It is an information system dedicated
to the improvement of education through the dissemination of conference
proceedings, instructional programs, manuals, position papers, program
descriptions, research and technical reports, literature reviews, and
other types of material. ERIC aids school administrators, teachers,
researchers, information specialists, professional organizations,
students, and others in locating and using information which was
previously unpublished or which would not be widely disseminated
otherwise.

The ERIC Clearinghouse on Tests, Measurement, and Evaluation (ERIC/TM)
acquires and processes documents and journal articles within the
scope of interest of the Clearinghouse for announcement in ERIC's
index and abstract journals: Resources in Education (RIE) and Current
Index to Journals in Education (CIJE).

Besides processing documents and journal articles, the Clearinghouse
has another major function: information analysis and synthesis.
The Clearinghouse prepares bibliographies, literature reviews,
state-of-the-art papers, and other interpretive reports on topics in its
area of interest.



ABOUT THE BIBLIOGRAPHY

This bibliography was compiled to provide teachers, researchers,
and evaluators of educational achievement tests access to information
found in journal articles, research papers, books and dissertations
concerning domain referenced testing. The primary purpose of these
tests is to estimate the extent to which a student has attained or
retained the intended learning outcomes of a particular segment of
instruction. Domain referenced testing (DRT) is particularly useful in
ascertaining the learner's strengths and weaknesses in a specific
subject area. This bibliography is not limited to any educational
level, nor confined to any specific curriculum area. Five data bases
were searched by computer for this bibliography.

The ERIC data base yielded documents announced in Resources in
Education and journal articles indexed in Current Index to Journals
in Education, which covers over 700 education-related journals.
Psychological Abstracts, an index providing summaries of literature
in psychology and related disciplines, covers journals,
technical reports, monographs, and other scientific documents.
Exceptional Child Education Abstracts (CEC), a data base concerned
with published and unpublished literature on the education of
handicapped and gifted children, covers such sources as books, journal
articles, teaching materials, and reports. Sociological Abstracts,
an index covering literature in sociology and related disciplines,
scans over 1200 journals and serial publications a year.
Comprehensive Dissertation Abstracts is a definitive subject, title, and
author guide to virtually every American dissertation accepted at
an accredited institution since 1861, and to thousands of Canadian
dissertations.

For ERIC documents (those with an ED number appearing at the
end of the bibliographic citation) the following information is
presented when available: personal or corporate author, title, date of
publication, number of pages, and availability information. These
documents may be purchased in hard copy or in microfiche from the
ERIC Document Reproduction Service (EDRS). Price information and an
order form are appended. However, ERIC microfiche collections are
available at approximately 590 locations throughout the country,
and most of these collections are open to the public. If you are
unable to find a collection in your area, you may write ERIC/TM for a
listing. Documents with a UMI Order number can be obtained from:
University Microfilms International, P.O. Box 1764, Ann Arbor,
Michigan 48106.

Journal articles (those entries appearing with an EJ number
or otherwise identified as journals by the bibliographic citation)
are not available from EDRS. However, most of these journals are
readily available in college and university libraries as well as
some large public libraries.

All entries are listed alphabetically by author and are numbered.
An abstract, or in the case of most journal articles, a shorter
annotation, is provided for each entry. A subject index consisting of
ERIC descriptors and identifiers reflecting major emphasis is also
provided. Numbers appearing in the index refer to entries.



1. Baker, Eva L. Beyond Objectives: Domain-Referenced Tests for
Evaluation and Instructional Improvement. Educational Technology,
Vol. 14, No. 6, June 1974, pages 10-16.

This article describes the inadequacy of current methods of assessing
attainment of behavioral objectives. It is suggested that
domain-referenced testing be more frequently utilized. Guidelines for
preparing domains are presented, with illustrative examples.

2. Baker, Eva L. Using Measurement to Improve Instruction. September
1972. 8 pages. ED i)^5H762.

Instructional improvement within the context of criterion-referenced
and norm-referenced tests is described. Such categories overemphasize
test interpretation rather than design characteristics of achievement
tests. Data from most measurement situations may be reported or
interpreted either according to criterion- or norm-referenced standards.
How the test is developed and what it represents is of critical
importance. The paper proposes alternative conceptualizations of
test design: construct-referenced, objectives-referenced, and
domain-referenced. Using student data, the teacher needs to identify
deficiencies in achievement, possible explanations, and remedies,
and to put the remedies into operation. An analysis of the utility
of each test type results in the appraisal that domain-referenced
tests provide the most information for teachers, and therefore are
the most desirable as data sources for instructional improvement.
However, because of lack of knowledge about instruction, poor training
in available instructional principles, and lack of resources to
encourage changes in instructional habits, it is concluded that
instructional improvement, even if measurement considerations were
satisfied, is not imminent.



3. Besel, Ronald, and Okada, Masahito. The Development of
Domain-Referenced Tests for an Objective-Based Reading Program. April 1974.
8 pages. ED 093 918.

Criteria for the selection of item forms, content domains, and sampling
procedures for program specific, domain-referenced tests are developed.
The primary purpose of these tests is to estimate the extent to which
individual pupils have attained or retained the intended learning
outcomes of a particular segment of instruction. Tests developed
for the tryout of the SWRL Reading Program illustrate the application
of the criteria. A variety of critical reading skills is assessed.
The use and potential value of facet designed tests for assessing
word recognition and novel word decoding is described. Error type
scores provide potentially valuable information on which to base
prescriptions of supplementary instruction.

4. Denham, Carolyn H. Criterion-Referenced, Domain-Referenced and
Norm-Referenced Measurement: A Parallax View. Educational Technology,
Vol. 15, No. 12, December 1975, pages 9-12.

Three measurement techniques, criterion-referenced (CR),
domain-referenced (DR), and norm-referenced (NR) tests are defined, analyzed,
and compared in this article. Explained and underscored is the
clear separation of criterion-referenced tests from domain-referenced
tests, a separation seldom made by education experts. Although both
tests utilize a random sample of items drawn from a larger domain of
items, criterion-referenced tests compare the examinee to a specific
criterion, while domain-referenced tests are more concerned with
ascertaining the examinee's individual strengths and weaknesses in
a particular subject area. Also, the value of combining the three
types of tests (CR, DR, and NR) to construct new tests is suggested.
Finally, the need for reliable and valid item selection procedures
is stated. Possible selection processes for CR, DR, and NR tests
are outlined, as well as for the four combination-type tests suggested
by the author.




5. Duncan, Ann Dell. Tracking Behavioral Growth: Day-to-Day Measures of
Frequency Over Domains of Performance. Educational Technology,
Vol. 14, No. 6, June 1974, pages

In this article ways in which domain-referenced testing may be
utilized in a program of facilitating personal growth are described.
The personal growth program described utilizes behavioral principles
in helping participants to change. Its use together with the
innovative testing approach is discussed in terms of implementation
and advantages.

6. Durnin, John, and Scandura, Joseph M. An Algorithmic Approach to
Assessing Behavior Potential: Comparison with Item Forms and
Hierarchical Technologies. Journal of Educational Psychology,
Vol. 65, No. 2, October 1973, pages 262-272.

In this study two item form technologies, the item forms technology
(domain-referenced testing) of Hively (1968), and the hierarchical or
stratified forms technology of Ferguson (1969), were compared with
an algorithm-based technology for assessing behavior potential. Bases
for comparison were (a) relative effectiveness in predicting
performance on individual test items, based on performance on items
identified according to respective technologies; (b) relative power
(generalizability); (c) relative efficiency (number of items); and
(d) relative validity of item hierarchies. Two parallel tests on
column subtraction were administered to 25 subjects. Test performance
was analyzed according to each technology. Algorithmic technology
(a) better predicted individual subjects' failure on individual
second test items, (b) had higher generalizability levels, (c) was
more efficient, and (d) had higher validity indices on hierarchical
ordering of tasks than item form technologies. Implications for
diagnostic testing and remediation were discussed.



7. Ferguson, Richard L., and Hsu, Tse-Chi. The Application of
Item Generators for Individualizing Mathematics Testing and
Instruction. May 1971. 21 pages. ED 053 ^35.

Described is a procedure for utilizing a computer to generate
domain-referenced tests in mathematics. The procedure can be
adapted for use in testing and instructional programs in either
an on-line or off-line mode. It requires specification of the
objectives of interest in behavioral terms and grouping them into
sets that share a common content. Addition, multiplication, and
fractions are examples of possible groupings. To implement the
procedure, one of the sets of objectives resulting from the grouping
process is selected, and item forms representative of the behaviors
implied by each objective in the set are specified. Then an item
generator is developed that facilitates the construction of items
representative of all item forms so identified. Given an on-line
computer capability, the authors describe how it is possible to
use the proposed item generator for assisting measurement and
instruction in an individualized mathematics program.
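
The general idea of an item generator driven by item forms can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions, not the authors' program: the ItemForm class, the
two example forms, and the function names are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ItemForm:
    """Hypothetical item form: a label, an operator, and a rule for drawing operands."""
    name: str
    operator: str  # "+" or "*"
    draw_operands: Callable[[random.Random], Tuple[int, int]]


def generate_item(form: ItemForm, rng: random.Random) -> Tuple[str, int]:
    """Produce one (stem, correct answer) pair from an item form."""
    a, b = form.draw_operands(rng)
    answer = a + b if form.operator == "+" else a * b
    return f"{a} {form.operator} {b} = ?", answer


def generate_test(forms: List[ItemForm], items_per_form: int, seed: int) -> List[Tuple[str, int]]:
    """Generate the same number of items from every item form in the set."""
    rng = random.Random(seed)  # seeded so a test can be regenerated exactly
    return [generate_item(form, rng) for form in forms for _ in range(items_per_form)]


# Assumed example forms for one content grouping (whole-number arithmetic).
EXAMPLE_FORMS = [
    ItemForm("one-digit sums", "+", lambda r: (r.randint(1, 9), r.randint(1, 9))),
    ItemForm("two-digit by one-digit products", "*", lambda r: (r.randint(10, 99), r.randint(2, 9))),
]

if __name__ == "__main__":
    for stem, answer in generate_test(EXAMPLE_FORMS, items_per_form=3, seed=1):
        print(stem, "->", answer)
```

Because each form fixes the structure of an item and only the operands
vary, any number of parallel tests can be drawn from the same set of
forms.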

8. Geisinger, Kurt F. A Systems Approach to Item Production and Review
in a Computer Managed Instruction Project. April 1976. 20 pages.
ED 121 280.

An item generation procedure is described which was utilized in
the development of Computer Managed Review and Examination courses
for the education of nurses in remote areas. The major phases are
the processes of domain definition, item writing, and item edition.
Specific discussion is presented concerning methods of item
construction to assess technical vocabulary, concept learning, and
the application of nursing principles to the solution of problems.
The entire test construction procedure is briefly reviewed; this
procedure includes numerous quality checks to insure the production
of both high caliber instructional materials and domain-referenced
tests. The criteria used at various editing and review stages are
mentioned. An initial evaluation of the items is made, and problems
inherent in the item generation procedure are offered.

9. Haladyna, Thomas. An Analysis of Two Procedures for Decision Making
When Using Domain-Referenced Tests. April 1975. 22 pages.
ED 104 957.

A central problem for the user of domain-referenced tests in instruction
is deciding who has passed and who has failed. Two procedures were
presented and discussed. The first, employing classical test theory,
was found to be more useful for larger domains and where the passing
standard is 70 percent or less. The sampling procedure suggested by
Millman (197>) was found to be more applicable when the test size
approximates the size of the domain. Neither procedure appears useful
when the passing standard is high. In light of the large numbers of
examinees classified as uncertain when real test data is used, it was
concluded that neither procedure offers much to decision making in
systematic individualized instruction.
10. Haladyna, Thomas. The Paradox of Criterion-Referenced Measurement.
April 1976. 25 pages. ED 126 155.

The existence of criterion-referenced (CR) measurement is questioned
in this paper. Despite beliefs that differences exist between two
alternative forms of measurement, CR and norm-referenced (NR), an
analysis of philosophical and psychological descriptions of
measurement, as well as a growing number of empirical studies, reveals that
the common distinctions drawn between CR and NR measurement focus on
what occurs prior to and following measurement, namely the writing
of items and the interpreting of test scores. In this respect, the
use of the term "criterion-referenced measurement" is paradoxical.
The purpose, method of construction, and usefulness of
domain-referenced tests are also discussed in this article, the
domain-referenced tests being treated as a particular type of
criterion-referenced measurement.



11. Haladyna, Thomas M. The Quality of Domain-Referenced Test Items.
April 1976. 28 pages. ED 129 846.

The objectives of this study were to first determine whether or
not the empirical item analysis of domain referenced tests (DR) was
justified; and second, in the event that it was, which of a set of
recommended procedures was most effective for determining item
quality. The analysis that followed led to the conclusion that
empirical procedures were highly desirable. When these empirical
procedures were applied to test data, the results indicated that
four different techniques provided almost identical information:
Rasch statistics, instructional sensitivity indexes, traditional
statistics, and Bayesian indexes. Based on these results, it would
seem that any one of these four would serve adequately.

12. Hentschke, Guilbert C., and Levine, Donald M. Planning for Evaluation
in Performance Contracting Experiments: The Connection to
Domain-Referenced Testing Theory. Educational Technology, Vol. 14, No. 6,
pages 38-43.

This article delineates the impact of incorporating domain-referenced
testing concepts into performance contracts with teachers. The effects
of domain-referenced testing theory on 11 problems resulting from
this assessment are described. It is suggested that this approach
would alleviate some of these testing difficulties.



13. Hively, Wells. Domain Referenced Testing. October 1974. 150 pages.
Available from Educational Technology Publications, Englewood Cliffs,
New Jersey 07632 ($4.95).

The central assumption in domain-referenced testing (DRT), as presented
in this book, is that a domain may be determined which adequately
represents a particular universe of knowledge. After a domain has
been established, the technological and practical problem of using
domain-referenced testing must be solved. This book contains a
collection of twelve short chapters covering such DRT topics as
definition and function; sampling plans; instructional accountability;
curriculum assessment, management, and modification; teacher, program,
and product evaluation; relation of performance contracting experiments
to DRT; individualized instruction; and behavioral growth tracking.
Brief comments and helpful sources are provided by the editor.



14. Hively, Wells. Introduction to Domain-Referenced Testing. Educational
Technology, Vol. 14, No. 6, June 1974, pages 5-10.

This article describes the theory and utilization of the
domain-referenced approach to the measurement and technology of educational
objectives. According to this method, sample problems are generated
in ways clearly specifiable before the test. Thus, a clearly
specified domain of competence exists and is available to the test
taker prior to the test. Domain-referenced testing has its roots
in learning theory and collects data useful in evaluating growth.
Its more traditional alternative, norm-referenced testing (NRT),
has its roots in the study of individual differences so that the
structure of the content is not considered important. NRT collects
data useful in prediction and selection but not in evaluating
instruction. Education requires both types of testing but the
latter has been emphasized traditionally.



15. Hively, Wells, Ed., and Reynolds, Maynard C., Ed. Domain-Referenced
Testing in Special Education. 1975. 146 pages. Available from
Council for Exceptional Children, 1920 Association Drive, Reston,
Virginia 22091 ($4.00), Product 101.

Presented are eight papers that deal with the educational
implications for handicapped children of domain-referenced testing, as
contrasted with standardized norm-referenced achievement testing.
The crucial aspects of each testing model are highlighted by
W. Hively in an introductory section. M. Reynolds surveys past
and present special education pressures and analyzes their impact
on testing. T. Donlon reviews historical and technical concepts
of test-score referencing and points out complexities and
confusions in terminology among different types of evaluation.
Discussed by J. Rosner are test construction and utilization in
connection with an adaptive perceptual skills curriculum.
Explained by A. Hofmeister are procedures and materials for training
teachers to integrate criterion-referenced testing and instruction
within the regular classroom. The creation of a comprehensive
computer-based information bank in the area of reading instruction
and its use in domain-referenced test development is described by
R. O'Reilly. Examined is the use of domain-referenced testing in
the delivery of special education services in a rural area
(F. Hammarback and C. Koenig). Ethical considerations in the use of
norm-, domain-, and behavior-referenced testing are considered in the
final paper by E. Joselyn. Also included are a 60-item bibliography
on domain-referenced testing and biographical information about the
authors.



16. Johnston, Thomas J. Program and Product Evaluation from a
Domain-Referenced Viewpoint. Educational Technology, Vol. 14, No. 6,
June 1974, pages 43-48.

Factors to be considered in analyzing educational products and
programs within the domain-referenced testing framework are
described in this article. This analysis is discussed in terms
of the characterization of domains and the application of
domain weighting.



17. Kwansa, Kofi Bassa. Investigation of the Relative Content Validity
of Norm-Referenced and Domain-Referenced Arithmetic Tests. Ph.D.
Dissertation, University of Pittsburgh, 1972. (UMI Order No. 73-4153,
256 pages.)

Norm-referenced and domain-referenced methods were each used to
build sixth grade arithmetic tests. The tests were administered
to samples of students, and the results used for making content
validity comparisons between the tests. Findings showed that the
domain-referenced tests had higher content validity than the
norm-referenced tests, that parallel forms of the norm-referenced tests
did not show equivalent degrees of content validity between
themselves, that scores on the norm-referenced tests correlated
highly with scores on the domain-referenced tests, and that the
domain-referenced tests had slightly smaller standard errors of estimation
and prediction than the norm-referenced tests.

18. Macready, George Byron. An Investigation into the Nature of
Interitem Relations and the Structure of Domain Hierarchies Found
Within a Domain Referenced Testing System. Dissertation Abstracts
International, Vol. 33, No. 5-A, 1972, page 2174.

The purpose of this study was to establish procedural techniques
which might be helpful in the assessment of achievement testing
systems which use operationally specified procedures for both the
generation and grouping of items. In addition, this study
attempted to assess the relations among items generated by a
"Domain Referenced Testing System" within the curriculum area of
multiplication of whole numbers. It was hoped that such an
assessment would provide information on the degree to which it is
possible to group such items into sets or "domains" of equivalent
items. Such information was of interest when the operation
procedures for grouping items were based on the assumed processes
involved in arriving at answers to the items. It was of further
interest to determine the order (or partial order) in which the
skills necessary for correctly answering items from the various
domains were acquired by students. Here an attempt was made to
determine both the nature and extent to which such a partial
ordering could be established. In general, it was possible to infer from
the results of this study that the Domain Referenced Testing System
studied provided an effective means of grouping items into sets of
"equivalent" items (i.e., items which a given student tended to
answer either all correctly or all incorrectly). Thus, this testing
system allowed for an accurate description of how students could be
expected to perform on an entire domain of items on the basis of
a small sample of items.



19. Macready, George B. The Structure of Domain Hierarchies Found Within
a Domain Referenced Testing System. Educational and Psychological
Measurement, Vol. 35, No. 3, Autumn 1975, pages 583-597.

In this article conditional states of item mastery found among
items from different item domains and the effectiveness of various
procedures for identifying such conditional relations were assessed.
The item domains considered were from the curriculum area of
multiplication of whole numbers, and were defined by a domain referenced
testing system. Data were gathered during pilot and main studies
from a total of 400 5th graders. It was possible to infer from the
results of this study that the domain referenced testing system
considered produced items which across domains showed strong
conditional relations. Comparisons of goodness of fit were made among
domain hierarchies with similar numbers of specified conditional
relations generated by 2 different empirical procedures and by
experts' judgment. Additional comparisons were made among models
generated by the same procedure but with different numbers of
specified conditional relations. Support for the validity of
empirically generated hierarchies with moderate numbers of
conditional relations among domains was provided.

20. Macready, George B., and Merwin, Jack C. Homogeneity Within Item
Forms in Domain Referenced Testing. Educational and Psychological
Measurement, Vol. 33, No. 2, Summer 1973, pages 351-360.

This article studied the nature of the relationships found in
domain-referenced tests among items within item forms and how
these relationships compare with an ideal case for diagnostic
tests in which, if a person gets 1 item within an item form right,
then he would get all items within the item form correct. Subjects
were 91 corpsmen from 5 randomly chosen Youth Conservation Centers.
Each subject was administered a 75-item test on the multiplication
of whole numbers which had been generated from 25 item forms based
on intuitive categories. Results show that, in most cases, item
forms which generate items of moderate difficulty can be used to
obtain relatively homogeneous sets of items of equivalent difficulty
for a defined population of subjects. Such item forms provide sets
of items superior to those which would be expected if item difficulties
alone were used to group items into sets.
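
One simple way to see how far a set of responses departs from the ideal
case described above (all items within an item form answered alike) is
to count, for each form, the share of subjects who answered every item
correctly or every item incorrectly. The Python sketch below is an
assumed illustration only; it is not the statistic used in the article,
and the function name and data layout are hypothetical.

```python
from typing import Dict, List


def form_homogeneity(responses: List[Dict[str, List[int]]]) -> Dict[str, float]:
    """For each item form, return the proportion of subjects whose scores
    (1 = right, 0 = wrong) on that form's items are all identical.

    `responses` holds one dict per subject, mapping an item-form label
    to that subject's list of item scores (an assumed layout).
    """
    n_subjects: Dict[str, int] = {}
    n_consistent: Dict[str, int] = {}
    for subject in responses:
        for form, scores in subject.items():
            n_subjects[form] = n_subjects.get(form, 0) + 1
            if all(score == scores[0] for score in scores):
                n_consistent[form] = n_consistent.get(form, 0) + 1
    return {form: n_consistent.get(form, 0) / n for form, n in n_subjects.items()}


if __name__ == "__main__":
    # Two hypothetical subjects, two hypothetical item forms, three items each.
    data = [
        {"form_A": [1, 1, 1], "form_B": [1, 0, 1]},
        {"form_A": [0, 0, 0], "form_B": [1, 1, 1]},
    ]
    print(form_homogeneity(data))
```

A value of 1.0 for a form corresponds to the ideal diagnostic case;
lower values indicate that items generated from the form are not
behaving as equivalent.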



21. Millman, Jason. Criterion-Referenced Measurement. In Popham, W. J.,
Evaluation in Education. Berkeley: McCutchan Pub. Corp., 1974.

This chapter should not only acquaint the reader with the present
state of the art on criterion-referenced (CR) measurement but also
suggest possible directions for further inquiry. The goal of the
first part of this chapter is to deal with the definitional dilemma
of CR measurement by proceeding from the more traditional view of
CR measurement to one that is more productive and provides a unifying
theme for the study. The focus of the second part of the chapter
is on tests intended to describe the current status of an examinee
with respect to a well-explicated set of performance tasks called
a domain. A random, or stratified random, sample of items from
a domain is called a domain-referenced test (DRT). Specific topics
include defining the item population, selecting test items, establishing
a passing score, determining test length, and evaluating the DRT.
Tests having the function of discriminating between individuals or
groups of individuals believed to differ on the attribute purportedly
measured by the test are called differential assessment devices (DAD's).
Some DAD's reference a particular objective or skill with sufficient
specification that a criterion-referenced interpretation is reasonable.
The development and evaluation of such tests, labeled CRDAD's, is
presented in the third section of the chapter. Finally, selected areas
of application in education which call for measuring status or
differentiating individuals or groups are discussed.
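
The definition of a domain-referenced test as a random, or stratified
random, sample of items from a domain can be illustrated with a short
Python sketch. The function names, the item labels, and the choice of
equal allocation per stratum are assumptions made for illustration;
they are not taken from the chapter.

```python
import random
from typing import Dict, List, Sequence


def simple_random_test(domain: Sequence[str], n_items: int, seed: int) -> List[str]:
    """Draw n_items from the whole domain without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(domain), n_items)


def stratified_random_test(strata: Dict[str, Sequence[str]], per_stratum: int, seed: int) -> List[str]:
    """Draw the same number of items, without replacement, from each stratum."""
    rng = random.Random(seed)
    test: List[str] = []
    for name in sorted(strata):  # fixed order keeps the draw reproducible
        test.extend(rng.sample(list(strata[name]), per_stratum))
    return test


if __name__ == "__main__":
    # A hypothetical domain of 20 items split into two strata.
    strata = {
        "easy": [f"easy-{i}" for i in range(10)],
        "hard": [f"hard-{i}" for i in range(10)],
    }
    whole_domain = strata["easy"] + strata["hard"]
    print(simple_random_test(whole_domain, 5, seed=0))
    print(stratified_random_test(strata, 2, seed=0))
```

Stratifying guarantees that every part of the domain is represented on
each test form, which a simple random draw does not.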


22. Millman, Jason. Sampling Plans for Domain-Referenced Tests.
Educational Technology, Vol. 14, No. 6, June 1974, pages 17-21.

A way of assigning items in a domain-referenced testing plan so
that examinees encounter them in orders not affecting their
subsequent responses is described in this article. A sampling scheme for
carrying this out is presented. Ways of using such a scheme and
possible sources of bias are also discussed.



23. Nitko, Anthony J. Some Considerations When Using a Domain-Referenced
System of Achievement Tests in Instructional Situations. March 2,
1970. 24 pages. ED 037 793.

The problem of using a domain-referenced system of achievement tests
is discussed as it relates to the design of instruction. Testing
problems are discussed from the point of view that the teacher, pupil,
and/or automation needs certain kinds of information in order to make
instructional decisions that are adaptive to the individual learner.
The design of achievement tests based on item forms is determined by
the purpose(s) for which the information obtained from them is
needed. The selection of items from the defined domain of item
forms is discussed in terms of the purpose for testing, the
relationship between items and instruction, and the relationship between
instructional objectives and item forms.

24. Nitko, Anthony J., and Hsu, Tse-Chi. Using Domain-Referenced Tests
for Student Placement, Diagnosis, and Attainment in a System of
Adaptive, Individualized Instruction. Educational Technology,
Vol. 14, No. 6, June 1974, pages 48-54.

This article illustrates ways in which domain-referenced testing
might be used in an adaptive and individualized system of instruction.
It is suggested that measurement and instruction should be integrated
into a decision-making context. Examples are provided.

25. Olympia, P. L., Jr. Repetitive Domain-Referenced Testing Using
Computers: The TITA System. June 1975. 9 pages. ED 111 358.

The TITA (Totally Interactive Testing and Analysis) System algorithm
for the repetitive construction of domain-referenced tests utilizes
a compact data bank, is highly portable, is useful in any discipline,
requires modest computer hardware, and does not present a security
problem. Clusters of related key phrases, statement phrases, and
distractors form minipools from which the computer generates items
for a domain-referenced unit of instruction. Test items can take
the form of multiple-choice, true-false, matching, and fill-in
questions. A random number generator produces data for test items
requiring numerical solutions, and the correct answer is computed
from a coded formula so computational subroutines are not required
for each test item. This component of computer managed instruction
allows the instructor to key related items in the data minipool to
learning resources and to code the resources themselves for inclusion
in the data bank. Use of this system for elementary, secondary, or
undergraduate courses can facilitate instructional management and
result in positive effects on student morale.
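
The idea of pairing randomly generated data with a coded formula, so
that no separate computational routine is written for each item, can be
sketched as follows. This is a hedged illustration and not the TITA
implementation: the postfix (reverse Polish) formula coding, the
function names, and the sample item are all assumptions.

```python
import random
from typing import Dict, List, Tuple

# Binary operators allowed in a coded formula.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def evaluate_rpn(formula: str, values: Dict[str, float]) -> float:
    """Evaluate a space-separated postfix formula, e.g. "d t /"."""
    stack: List[float] = []
    for token in formula.split():
        if token in OPS:
            b = stack.pop()
            a = stack.pop()
            stack.append(OPS[token](a, b))
        elif token in values:
            stack.append(float(values[token]))
        else:
            stack.append(float(token))  # numeric constant
    if len(stack) != 1:
        raise ValueError(f"malformed formula: {formula!r}")
    return stack[0]


def generate_numeric_item(
    template: str,
    ranges: Dict[str, Tuple[int, int]],
    formula: str,
    rng: random.Random,
) -> Tuple[str, float]:
    """Fill the template with random integers and compute the keyed answer."""
    values = {name: rng.randint(lo, hi) for name, (lo, hi) in ranges.items()}
    return template.format(**values), evaluate_rpn(formula, values)


if __name__ == "__main__":
    stem, answer = generate_numeric_item(
        "A car travels {d} km in {t} hours. What is its average speed in km/h?",
        {"d": (100, 300), "t": (2, 5)},
        "d t /",
        random.Random(3),
    )
    print(stem, "->", answer)
```

Because the formula is stored as data alongside the item text, adding a
new numerical item to the bank requires no new code.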

26. O'Reilly, Robert P., and others. The Validation and Refinement of
Measures of Literal Comprehension in Reading for Use in Policy
Research and Classroom Management. February 1976. 424 pages.
ED 133 363.

The report proposes to complete the validation and refinement of
a new domain referenced testing technology designed to assess
literal comprehension ability in students in grades 1-12. The
domain referenced measures in this technology, along with other
more traditional measures of reading comprehension, literal and
non-literal, are subsequently intended to be used in part in large
scale studies of productivity in school reading programs. To date,
studies of productivity in reading instruction have had little
influence on educational decision-making due to serious methodological
problems, one of the major problems being the lack of adequate
measures of program output. The report further proposes to solve
a number of important instructional management problems created by
the use of the inadequate information available from traditional
measures of reading comprehension. The new domain referenced
measures of reading comprehension will have an improved basis for
scaling students on comprehension ability, and ability scores from
this scale will be referenced to an additional scale defining an
individual or group's ability to read in several domains of written
discourse. These scaling features will allow for the assignment of
students to specific levels of reading materials in specific
instructional or content domains, a procedure not possible with existing
measures of reading comprehension.



27. Popham, W. James. An Evaluation Guidebook: A Set of Practical
Guidelines for the Educational Evaluator. 1972. 89 pages. Available from
Instructional Objectives Exchange, Box 24095, Los Angeles, California
90024 ($2.50).

The third chapter of this book discusses the topic of domain-
referenced testing (DRT) in detail. DRT is seen as a useful
measuring device in determining whether or not an educational
objective has been accomplished. Its essential ingredient
involves defining the domain of learner behaviors called for in the
objective, then referencing all test items to this domain. The
next procedure in constructing a domain-referenced test is the
preparation of an item form which contains three necessary elements: (1)
instructions to students; (2) stimulus limits; and (3) response limits.
These elements are defined and discussed by the author, and two
illustrative item forms are presented.
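
As an illustration only, an item form with the three elements named
above might be represented as follows in Python. The abstract does
not give the author's definitions, so the field meanings, the
`ItemForm` class, and the arithmetic example are assumptions made for
this sketch: stimulus limits are taken to be the rules bounding what
may appear in an item, and response limits the rules bounding the
answer options offered.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemForm:
    """Hypothetical item form holding the three elements named above."""
    instructions: str        # (1) instructions to students
    stimulus_limits: range   # (2) assumed: values an addend may take
    response_limits: int     # (3) assumed: number of answer options

    def generate(self, rng):
        """Draw one item whose stimulus and options respect the limits."""
        a = rng.choice(self.stimulus_limits)
        b = rng.choice(self.stimulus_limits)
        correct = a + b
        # Distractors are near misses; sampling offsets without
        # replacement keeps all options distinct.
        offsets = rng.sample([-3, -2, -1, 1, 2, 3],
                             self.response_limits - 1)
        options = sorted([correct] + [correct + d for d in offsets])
        return {"stimulus": f"{a} + {b} = ?",
                "options": options,
                "answer": correct}


form = ItemForm(
    instructions="Circle the correct sum.",
    stimulus_limits=range(10, 100),  # two-digit addends only
    response_limits=4,
)
item = form.generate(random.Random(0))
```

Every item drawn from `form` belongs to the same defined domain
(sums of two two-digit numbers), so each item is referenced to that
domain by construction.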


28. Popham, W. James. Teacher Evaluation and Domain-Referenced
Measurement. Educational Technology, Vol. 14, No. 6, June 1974, pages 35-37.

The effect of domain-referenced measurement on teacher evaluation is
discussed in this article. This approach corrects most of the deficits
of standardized tests. Because the domain-referenced approach
produces clear categories of learner behaviors to be measured, it
enables teachers to know better where their teaching has not worked.
Ways of improving teacher performance through use of this method are
described.



29. Sanders, James R., and Murray, Stephen L. Alternatives for
Achievement Testing. Educational Technology, Vol. 16, No. 3, March 1976,
7 pages.

Explored are four selected strategies of achievement test development
(norm-referenced, criterion-referenced, objectives-referenced, and
domain-referenced testing), as well as implications for their
application. Each type of testing approach is discussed in terms
of such aspects as definition, key emphasis, development procedure,
item selection, necessary input for test development, types of scores
reported, examples of test interpretation, recommended uses, and
inappropriate uses and limitations. It is concluded that the best
achievement testing system is probably a combination or variation of
the approaches.



30. Sension, Donald B., and Rabehl, George. Test-Item Domains and
Instructional Accountability. Educational Technology, Vol. 14, No. 6,
June 1974, pages 22-28.

This article discusses the history, goals, and implementation of
instructional accountability in education. Domain-referenced
testing is suggested as a way of increasing and assessing such
accountability. Application of this approach in 2 school systems
is described. Evaluation of its success is discussed.



31. Whitely, Susan E. Domain Referenced Testing: An Alternative Model
for Test Construction. Proceedings of the Annual Convention of the
American Psychological Association, Vol. 6, Pt. 2, 1971, pages 515-516.

Domain-referenced testing, interpreting scores with direct reference
to the domain of item content, has been given increasing attention in
recent years. Neither the programed learning approach nor the
achievement test approach has been able to provide models that can handle
complex and heterogeneous domains to allow a domain-referenced score
interpretation. A modified version of Stephenson's structured
Q-sample model is presented to provide an alternative method of test
construction. It is different from current approaches because it
provides information concerning domain structure and does not depend
upon random sampling to estimate true score.



32. Willoughby, Lee, and others. A Comparison of Domain-Referenced and
Classic Psychometric Test Construction Methods. 1976. 13 pages.
ED 131 128.

This study compared a domain referenced approach with a traditional
psychometric approach in the construction of a test. Results of the
December, 1975 Quarterly Profile Exam (QPE) administered to 400
examinees at a university were the source of data. The 400 item QPE is
a five alternative multiple choice test of information a "safe"
physician should know. Content of the exam covers the broad areas
of Internal Medicine, Pediatrics, Obstetrics/Gynecology, Surgery,
and Basic Science, as well as additional sub-topics. For purposes
of this study, two 75 item tests were constructed by pulling from
the 400 item QPE by two different strategies. The domain referenced
approach was used to construct a 75 item test by a random sample of
the 400 items. Selection of the 75 items with the highest point
biserial item-total correlations represented the traditional
psychometric approach to test construction. The exams were then rescored
to obtain scores and item analysis data on the random and
psychometric tests. Then, the two tests were compared with respect to
distribution of p values (the proportion answering an item correctly),
point biserial item-total correlations, student scores across medical
school year level and reliability. The results were discussed with
regard to their consistency with expectations of the domain referenced
and psychometric approaches.
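
The two selection strategies compared in this study can be sketched
in Python. The response data below are simulated, not the QPE
results; the sizes are scaled down and the function names are
hypothetical. The point biserial item-total correlation is computed
here as the Pearson correlation between a 0/1 item score and the
total score, and the p value of an item is the proportion of
examinees answering it correctly.

```python
import math
import random


def p_value(item_scores):
    """Proportion of examinees answering the item correctly."""
    return sum(item_scores) / len(item_scores)


def point_biserial(item_scores, totals):
    """Pearson correlation between a 0/1 item and the total score."""
    n = len(totals)
    mean_x = sum(item_scores) / n
    mean_y = sum(totals) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, totals))
    var_x = sum((x - mean_x) ** 2 for x in item_scores)
    var_y = sum((y - mean_y) ** 2 for y in totals)
    if var_x == 0 or var_y == 0:
        return 0.0  # no variance: treat the item as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def select_random(n_items, k, rng):
    """Domain referenced strategy: a simple random sample of items."""
    return sorted(rng.sample(range(n_items), k))


def select_psychometric(responses, k):
    """Traditional strategy: the k items with the highest item-total r."""
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    r = [point_biserial([row[j] for row in responses], totals)
         for j in range(n_items)]
    ranked = sorted(range(n_items), key=lambda j: r[j], reverse=True)
    return sorted(ranked[:k])


# Simulated data (an assumption): 200 examinees, 40 items, choose 10.
rng = random.Random(0)
abilities = [rng.gauss(0, 1) for _ in range(200)]
difficulties = [rng.uniform(-2, 2) for _ in range(40)]
responses = [[int(rng.random() < 1 / (1 + math.exp(b - a)))
              for b in difficulties] for a in abilities]

random_test = select_random(40, 10, rng)
psychometric_test = select_psychometric(responses, 10)
```

Rescoring each examinee on `random_test` and on `psychometric_test`
would then allow the kinds of comparison the abstract lists, such as
the distribution of p values and the item-total correlations of the
two forms.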